Overview


Dataset statistics

Number of variables: 21
Number of observations: 9,994
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 168.0 B
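
The two memory figures can be cross-checked: total size divided by the number of observations should be close to the average record size. A minimal check, assuming 1 MiB = 1,048,576 bytes and that the total is rounded to one decimal place in the report:

```python
# Values copied from the "Dataset statistics" block of the report.
total_mib = 1.6      # total size in memory, as rounded in the report
n_rows = 9_994       # number of observations

BYTES_PER_MIB = 1024 * 1024
avg_record_bytes = total_mib * BYTES_PER_MIB / n_rows

# The report lists 168.0 B; any small gap comes from the rounded 1.6 MiB.
print(f"{avg_record_bytes:.1f} B per record")
```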

Variable types

Numeric: 6
Text: 6
DateTime: 2
Categorical: 7

Alerts

Country has constant value "United States" (Constant)
Category is highly overall correlated with Sub-Category (High correlation)
Discount is highly overall correlated with Profit (High correlation)
Postal Code is highly overall correlated with Region and 1 other fields (High correlation)
Profit is highly overall correlated with Discount and 1 other fields (High correlation)
Region is highly overall correlated with Postal Code and 1 other fields (High correlation)
Sales is highly overall correlated with Profit (High correlation)
State is highly overall correlated with Postal Code and 1 other fields (High correlation)
Sub-Category is highly overall correlated with Category (High correlation)
Row ID is uniformly distributed (Uniform)
Row ID has unique values (Unique)
Discount has 4798 (48.0%) zeros (Zeros)
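
The Constant, Unique and Zeros alerts above are simple per-column checks. The sketch below shows one way such checks could be written. The helper name `simple_alerts` and the 25% zero threshold are assumptions for illustration, not the profiler's own API or defaults.

```python
from collections import Counter

def simple_alerts(name, values, zero_threshold=0.25):
    """Return alert strings for one column (hypothetical helper)."""
    n = len(values)
    if n == 0:
        return []
    counts = Counter(values)
    alerts = []
    # Constant: every row holds the same value.
    if len(counts) == 1:
        only = next(iter(counts))
        alerts.append(f'{name} has constant value "{only}"')
    # Unique: no value is repeated.
    if n > 1 and len(counts) == n:
        alerts.append(f"{name} has unique values")
    # Zeros: the share of zero values reaches the (assumed) threshold.
    zeros = sum(1 for v in values if v == 0)
    if zeros / n >= zero_threshold:
        alerts.append(f"{name} has {zeros} ({zeros / n:.1%}) zeros")
    return alerts

print(simple_alerts("Country", ["United States"] * 5))
print(simple_alerts("Row ID", [1, 2, 3, 4, 5]))
print(simple_alerts("Discount", [0, 0, 0.2, 0.2, 0]))
```

The toy inputs are made up; on the real Discount column the same ratio would be 4798 / 9994, which the report rounds to 48.0%.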

Reproduction

Analysis started: 2025-11-16 15:49:59.405956
Analysis finished: 2025-11-16 15:50:14.508055
Duration: 15.1 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json
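
A report of this kind can be regenerated with ydata-profiling along the following lines. This is a sketch only: the function name, the CSV path and the report title are placeholders, and the configuration actually used for this report (the `config.json` above) is not reproduced here.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are placeholders)."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(out_path)
    return out_path
```

Calling `build_report("orders.csv")` (a hypothetical file name) would write `report.html` next to the script. Date columns may need to be parsed explicitly when loading for them to be profiled as DateTime.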

Variables

Row ID
Real number (ℝ)

Uniform  Unique 

Distinct: 9994
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4997.5
Minimum: 1
Maximum: 9994
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 500.65
Q1: 2499.25
median: 4997.5
Q3: 7495.75
95-th percentile: 9494.35
Maximum: 9994
Range: 9993
Interquartile range (IQR): 4996.5
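
Because Row ID takes every integer from 1 to 9994 exactly once, these quantiles can be recomputed by hand. The sketch below assumes linear interpolation between the two nearest ranks, which is consistent with the fractional values shown above. The helper `quantile` is written for illustration.

```python
def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)        # fractional zero-based index
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

row_ids = list(range(1, 9995))                # 1, 2, ..., 9994

p5 = quantile(row_ids, 0.05)
q1 = quantile(row_ids, 0.25)
median = quantile(row_ids, 0.50)
q3 = quantile(row_ids, 0.75)
p95 = quantile(row_ids, 0.95)
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```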

Descriptive statistics

Standard deviation: 2885.1636
Coefficient of variation (CV): 0.57732139
Kurtosis: -1.2
Mean: 4997.5
Median Absolute Deviation (MAD): 2498.5
Skewness: 0
Sum: 49945015
Variance: 8324169.2
Monotonicity: Strictly increasing
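
The descriptive statistics for Row ID follow directly from the sequence 1, 2, ..., 9994. The sketch below recomputes them with Python's `statistics` module, assuming the report uses the sample variance (n - 1 denominator), which matches the figures above. A skewness of 0 and an excess kurtosis of about -1.2 are what a uniform distribution gives.

```python
import statistics

row_ids = list(range(1, 9995))              # 1, 2, ..., 9994

total = sum(row_ids)
mean = statistics.mean(row_ids)
variance = statistics.variance(row_ids)     # sample variance, n - 1 denominator
std_dev = statistics.stdev(row_ids)
cv = std_dev / mean                         # coefficient of variation
med = statistics.median(row_ids)
mad = statistics.median(abs(x - med) for x in row_ids)  # median absolute deviation

print(total, mean, round(variance, 1), round(std_dev, 4), round(cv, 8), mad)
```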
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
1                    1      < 0.1%
6666                 1      < 0.1%
6659                 1      < 0.1%
6660                 1      < 0.1%
6661                 1      < 0.1%
6662                 1      < 0.1%
6663                 1      < 0.1%
6664                 1      < 0.1%
6665                 1      < 0.1%
6667                 1      < 0.1%
Other values (9984)  9984   99.9%

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%

Value  Count  Frequency (%)
9994   1      < 0.1%
9993   1      < 0.1%
9992   1      < 0.1%
9991   1      < 0.1%
9990   1      < 0.1%
9989   1      < 0.1%
9988   1      < 0.1%
9987   1      < 0.1%
9986   1      < 0.1%
9985   1      < 0.1%

Distinct: 5009
Distinct (%): 50.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 139,916
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
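
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below classifies one of the sample values listed for this variable. It covers categories only, since the standard library has no lookup for scripts or blocks.

```python
import unicodedata
from collections import Counter

sample = "CA-2016-152156"   # one of the sample rows of this variable

# unicodedata.category returns two-letter codes:
# "Lu" = Uppercase Letter, "Nd" = Decimal Number, "Pd" = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in sample)

print(len(sample), dict(categories))
```

Each 14-character value contributes 10 digits, 2 letters and 2 dashes, which matches the category totals of 99,940, 19,988 and 19,988 over 9,994 rows.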

Unique

Unique: 2,538
Unique (%): 25.4%

Sample

1st row: CA-2016-152156
2nd row: CA-2016-152156
3rd row: CA-2016-138688
4th row: US-2015-108966
5th row: US-2015-108966
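
The report distinguishes "Distinct" (how many different values occur) from "Unique" (how many values occur exactly once), which is why 5009 distinct values include only 2,538 unique ones. A small sketch of the difference, using the five sample rows above:

```python
from collections import Counter

sample = [
    "CA-2016-152156", "CA-2016-152156", "CA-2016-138688",
    "US-2015-108966", "US-2015-108966",
]

counts = Counter(sample)
distinct = len(counts)                                # different values
unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once

print(distinct, unique)
```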
Value                Count  Frequency (%)
ca-2017-100111       14     0.1%
ca-2017-157987       12     0.1%
ca-2016-165330       11     0.1%
us-2016-108504       11     0.1%
ca-2015-131338       10     0.1%
ca-2016-105732       10     0.1%
us-2015-126977       10     0.1%
us-2015-163433       9      0.1%
ca-2017-140949       9      0.1%
ca-2015-158421       9      0.1%
Other values (4999)  9889   98.9%

Most occurring characters

ValueCountFrequency (%)
125510
18.2%
-19988
14.3%
015492
11.1%
215381
11.0%
C8308
 
5.9%
A8308
 
5.9%
67904
 
5.6%
77438
 
5.3%
47400
 
5.3%
57338
 
5.2%
Other values (5)16849
12.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number99940
71.4%
Dash Punctuation19988
 
14.3%
Uppercase Letter19988
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
125510
25.5%
015492
15.5%
215381
15.4%
67904
 
7.9%
77438
 
7.4%
47400
 
7.4%
57338
 
7.3%
35449
 
5.5%
84042
 
4.0%
93986
 
4.0%
Uppercase Letter
ValueCountFrequency (%)
C8308
41.6%
A8308
41.6%
U1686
 
8.4%
S1686
 
8.4%
Dash Punctuation
ValueCountFrequency (%)
-19988
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common119928
85.7%
Latin19988
 
14.3%

Most frequent character per script

Common
ValueCountFrequency (%)
125510
21.3%
-19988
16.7%
015492
12.9%
215381
12.8%
67904
 
6.6%
77438
 
6.2%
47400
 
6.2%
57338
 
6.1%
35449
 
4.5%
84042
 
3.4%
Latin
ValueCountFrequency (%)
C8308
41.6%
A8308
41.6%
U1686
 
8.4%
S1686
 
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII139916
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
125510
18.2%
-19988
14.3%
015492
11.1%
215381
11.0%
C8308
 
5.9%
A8308
 
5.9%
67904
 
5.6%
77438
 
5.3%
47400
 
5.3%
57338
 
5.2%
Other values (5)16849
12.0%
Distinct: 1237
Distinct (%): 12.4%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Minimum: 2014-01-03 00:00:00
Maximum: 2017-12-30 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
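
The minimum and maximum give the time span covered by this variable. A short sketch of that arithmetic with the standard `datetime` module:

```python
from datetime import datetime

first = datetime.fromisoformat("2014-01-03 00:00:00")   # Minimum
last = datetime.fromisoformat("2017-12-30 00:00:00")    # Maximum

span = last - first
calendar_days = span.days + 1    # inclusive count of calendar days

print(span.days, calendar_days)
```

The 1237 distinct dates can be set against that inclusive day count to see how many calendar days have at least one record.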
Histogram with fixed size bins (bins=50)
Distinct: 1334
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Minimum: 2014-01-07 00:00:00
Maximum: 2018-01-05 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

Ship Mode
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Standard Class  5968
Second Class    1945
First Class     1538
Same Day        543

Length

Max length: 14
Median length: 14
Mean length: 12.823094
Min length: 8

Characters and Unicode

Total characters: 128,154
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
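
The length statistics for Ship Mode can be derived from the four value counts alone. A minimal check, using the counts listed at the top of this variable's section:

```python
# Value counts copied from the Ship Mode summary.
counts = {
    "Standard Class": 5968,
    "Second Class": 1945,
    "First Class": 1538,
    "Same Day": 543,
}

n_rows = sum(counts.values())
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 6))
```

More than half of the rows are "Standard Class" (14 characters), so the median length equals the maximum length of 14.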

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowSecond Class
2nd rowSecond Class
3rd rowSecond Class
4th rowStandard Class
5th rowStandard Class

Common Values

Value           Count  Frequency (%)
Standard Class  5968   59.7%
Second Class    1945   19.5%
First Class     1538   15.4%
Same Day        543    5.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
class     9451   47.3%
standard  5968   29.9%
second    1945   9.7%
first     1538   7.7%
same      543    2.7%
day       543    2.7%
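
The word counts above appear to come from lower-casing each value and splitting it on whitespace. That tokenization is an assumption, but it reproduces the figures exactly, as this sketch shows:

```python
from collections import Counter

# Value counts copied from the Ship Mode "Common Values" table.
value_counts = {
    "Standard Class": 5968,
    "Second Class": 1945,
    "First Class": 1538,
    "Same Day": 543,
}

word_counts = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():    # assumed tokenization
        word_counts[word] += count

total_words = sum(word_counts.values())
shares = {w: round(100 * c / total_words, 1) for w, c in word_counts.items()}

print(word_counts.most_common(), total_words)
```

Note that the percentages are relative to the total number of words (19,988), not to the number of rows.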

Most occurring characters

ValueCountFrequency (%)
a22473
17.5%
s20440
15.9%
d13881
10.8%
9994
7.8%
l9451
7.4%
C9451
7.4%
S8456
 
6.6%
n7913
 
6.2%
r7506
 
5.9%
t7506
 
5.9%
Other values (8)11083
8.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98172
76.6%
Uppercase Letter19988
 
15.6%
Space Separator9994
 
7.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a22473
22.9%
s20440
20.8%
d13881
14.1%
l9451
9.6%
n7913
 
8.1%
r7506
 
7.6%
t7506
 
7.6%
e2488
 
2.5%
c1945
 
2.0%
o1945
 
2.0%
Other values (3)2624
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
C9451
47.3%
S8456
42.3%
F1538
 
7.7%
D543
 
2.7%
Space Separator
ValueCountFrequency (%)
9994
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin118160
92.2%
Common9994
 
7.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a22473
19.0%
s20440
17.3%
d13881
11.7%
l9451
8.0%
C9451
8.0%
S8456
 
7.2%
n7913
 
6.7%
r7506
 
6.4%
t7506
 
6.4%
e2488
 
2.1%
Other values (7)8595
 
7.3%
Common
ValueCountFrequency (%)
9994
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII128154
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a22473
17.5%
s20440
15.9%
d13881
10.8%
9994
7.8%
l9451
7.4%
C9451
7.4%
S8456
 
6.6%
n7913
 
6.2%
r7506
 
5.9%
t7506
 
5.9%
Other values (8)11083
8.6%
Distinct793
Distinct (%)7.9%
Missing0
Missing (%)0.0%
Memory size78.2 KiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters79,952
Distinct characters40
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)0.1%

Sample

1st rowCG-12520
2nd rowCG-12520
3rd rowDV-13045
4th rowSO-20335
5th rowSO-20335
ValueCountFrequency (%)
wb-2185037
 
0.4%
ma-1756034
 
0.3%
pp-1895534
 
0.3%
jl-1583534
 
0.3%
ck-1220532
 
0.3%
sv-2036532
 
0.3%
jd-1589532
 
0.3%
eh-1376532
 
0.3%
zc-2191031
 
0.3%
ep-1391531
 
0.3%
Other values (783)9665
96.7%

Most occurring characters

ValueCountFrequency (%)
111915
14.9%
-9994
12.5%
08532
 
10.7%
57865
 
9.8%
24682
 
5.9%
72931
 
3.7%
62909
 
3.6%
92904
 
3.6%
82818
 
3.5%
32779
 
3.5%
Other values (30)22623
28.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number49970
62.5%
Uppercase Letter19945
 
24.9%
Dash Punctuation9994
 
12.5%
Lowercase Letter43
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S1798
 
9.0%
C1725
 
8.6%
M1712
 
8.6%
B1642
 
8.2%
D1296
 
6.5%
A1227
 
6.2%
J1134
 
5.7%
P1105
 
5.5%
H968
 
4.9%
K932
 
4.7%
Other values (16)6406
32.1%
Decimal Number
ValueCountFrequency (%)
111915
23.8%
08532
17.1%
57865
15.7%
24682
 
9.4%
72931
 
5.9%
62909
 
5.8%
92904
 
5.8%
82818
 
5.6%
32779
 
5.6%
42635
 
5.3%
Lowercase Letter
ValueCountFrequency (%)
p29
67.4%
o8
 
18.6%
l6
 
14.0%
Dash Punctuation
ValueCountFrequency (%)
-9994
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common59964
75.0%
Latin19988
 
25.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S1798
 
9.0%
C1725
 
8.6%
M1712
 
8.6%
B1642
 
8.2%
D1296
 
6.5%
A1227
 
6.1%
J1134
 
5.7%
P1105
 
5.5%
H968
 
4.8%
K932
 
4.7%
Other values (19)6449
32.3%
Common
ValueCountFrequency (%)
111915
19.9%
-9994
16.7%
08532
14.2%
57865
13.1%
24682
 
7.8%
72931
 
4.9%
62909
 
4.9%
92904
 
4.8%
82818
 
4.7%
32779
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII79952
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
111915
14.9%
-9994
12.5%
08532
 
10.7%
57865
 
9.8%
24682
 
5.9%
72931
 
3.7%
62909
 
3.6%
92904
 
3.6%
82818
 
3.5%
32779
 
3.5%
Other values (30)22623
28.3%
Distinct793
Distinct (%)7.9%
Missing0
Missing (%)0.0%
Memory size78.2 KiB

Length

Max length22
Median length18
Mean length12.960676
Min length7

Characters and Unicode

Total characters129,529
Distinct characters57
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)0.1%

Sample

1st rowClaire Gute
2nd rowClaire Gute
3rd rowDarrin Van Huff
4th rowSean O'Donnell
5th rowSean O'Donnell
ValueCountFrequency (%)
michael120
 
0.6%
frank112
 
0.6%
john107
 
0.5%
patrick96
 
0.5%
brian93
 
0.5%
stewart93
 
0.5%
paul92
 
0.5%
ken91
 
0.5%
rick91
 
0.5%
matt86
 
0.4%
Other values (901)19072
95.1%

Most occurring characters

ValueCountFrequency (%)
a12011
 
9.3%
e11836
 
9.1%
n10241
 
7.9%
10059
 
7.8%
r9530
 
7.4%
i7919
 
6.1%
l6494
 
5.0%
o5850
 
4.5%
t5435
 
4.2%
s4546
 
3.5%
Other values (47)45608
35.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98856
76.3%
Uppercase Letter20461
 
15.8%
Space Separator10059
 
7.8%
Other Punctuation124
 
0.1%
Dash Punctuation29
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a12011
12.1%
e11836
12.0%
n10241
10.4%
r9530
9.6%
i7919
 
8.0%
l6494
 
6.6%
o5850
 
5.9%
t5435
 
5.5%
s4546
 
4.6%
h3857
 
3.9%
Other values (18)21137
21.4%
Uppercase Letter
ValueCountFrequency (%)
C1830
 
8.9%
S1798
 
8.8%
M1749
 
8.5%
B1696
 
8.3%
D1325
 
6.5%
A1282
 
6.3%
J1134
 
5.5%
P1105
 
5.4%
H1005
 
4.9%
K964
 
4.7%
Other values (16)6573
32.1%
Space Separator
ValueCountFrequency (%)
10059
100.0%
Other Punctuation
ValueCountFrequency (%)
'124
100.0%
Dash Punctuation
ValueCountFrequency (%)
-29
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin119317
92.1%
Common10212
 
7.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a12011
 
10.1%
e11836
 
9.9%
n10241
 
8.6%
r9530
 
8.0%
i7919
 
6.6%
l6494
 
5.4%
o5850
 
4.9%
t5435
 
4.6%
s4546
 
3.8%
h3857
 
3.2%
Other values (44)41598
34.9%
Common
ValueCountFrequency (%)
10059
98.5%
'124
 
1.2%
-29
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII129440
99.9%
None89
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a12011
 
9.3%
e11836
 
9.1%
n10241
 
7.9%
10059
 
7.8%
r9530
 
7.4%
i7919
 
6.1%
l6494
 
5.0%
o5850
 
4.5%
t5435
 
4.2%
s4546
 
3.5%
Other values (44)45519
35.2%
None
ValueCountFrequency (%)
ö61
68.5%
ä23
 
25.8%
ü5
 
5.6%
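
The three characters listed under the block label "None" are the only non-ASCII characters in this variable. The label most likely means the profiler could not assign them a block name (an assumption). Their code points lie in the range U+0080 to U+00FF, which the Unicode Standard names Latin-1 Supplement. A quick check with the standard library:

```python
import unicodedata

letters = "\u00f6\u00e4\u00fc"   # o, a and u with diaeresis, as in the table above

for ch in letters:
    code_point = ord(ch)
    # Latin-1 Supplement covers U+0080 through U+00FF.
    in_latin1_supplement = 0x80 <= code_point <= 0xFF
    print(f"U+{code_point:04X}", unicodedata.name(ch), in_latin1_supplement)
```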

Segment
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Consumer
5191 
Corporate
3020 
Home Office
1783 

Length

Max length11
Median length8
Mean length8.8374024
Min length8

Characters and Unicode

Total characters88,321
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowConsumer
2nd rowConsumer
3rd rowCorporate
4th rowConsumer
5th rowConsumer

Common Values

Value        Count  Frequency (%)
Consumer     5191   51.9%
Corporate    3020   30.2%
Home Office  1783   17.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
consumer5191
44.1%
corporate3020
25.6%
home1783
 
15.1%
office1783
 
15.1%

Most occurring characters

ValueCountFrequency (%)
o13014
14.7%
e11777
13.3%
r11231
12.7%
C8211
9.3%
m6974
7.9%
n5191
 
5.9%
s5191
 
5.9%
u5191
 
5.9%
f3566
 
4.0%
t3020
 
3.4%
Other values (7)14955
16.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter74761
84.6%
Uppercase Letter11777
 
13.3%
Space Separator1783
 
2.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o13014
17.4%
e11777
15.8%
r11231
15.0%
m6974
9.3%
n5191
 
6.9%
s5191
 
6.9%
u5191
 
6.9%
f3566
 
4.8%
t3020
 
4.0%
p3020
 
4.0%
Other values (3)6586
8.8%
Uppercase Letter
ValueCountFrequency (%)
C8211
69.7%
H1783
 
15.1%
O1783
 
15.1%
Space Separator
ValueCountFrequency (%)
1783
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin86538
98.0%
Common1783
 
2.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o13014
15.0%
e11777
13.6%
r11231
13.0%
C8211
9.5%
m6974
8.1%
n5191
 
6.0%
s5191
 
6.0%
u5191
 
6.0%
f3566
 
4.1%
t3020
 
3.5%
Other values (6)13172
15.2%
Common
ValueCountFrequency (%)
1783
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII88321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o13014
14.7%
e11777
13.3%
r11231
12.7%
C8211
9.3%
m6974
7.9%
n5191
 
5.9%
s5191
 
5.9%
u5191
 
5.9%
f3566
 
4.0%
t3020
 
3.4%
Other values (7)14955
16.9%

Country
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
United States
9994 

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters129,922
Distinct characters10
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUnited States
2nd rowUnited States
3rd rowUnited States
4th rowUnited States
5th rowUnited States

Common Values

Value          Count  Frequency (%)
United States  9994   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
united9994
50.0%
states9994
50.0%

Most occurring characters

ValueCountFrequency (%)
t29982
23.1%
e19988
15.4%
U9994
 
7.7%
n9994
 
7.7%
i9994
 
7.7%
d9994
 
7.7%
9994
 
7.7%
S9994
 
7.7%
a9994
 
7.7%
s9994
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter99940
76.9%
Uppercase Letter19988
 
15.4%
Space Separator9994
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t29982
30.0%
e19988
20.0%
n9994
 
10.0%
i9994
 
10.0%
d9994
 
10.0%
a9994
 
10.0%
s9994
 
10.0%
Uppercase Letter
ValueCountFrequency (%)
U9994
50.0%
S9994
50.0%
Space Separator
ValueCountFrequency (%)
9994
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin119928
92.3%
Common9994
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t29982
25.0%
e19988
16.7%
U9994
 
8.3%
n9994
 
8.3%
i9994
 
8.3%
d9994
 
8.3%
S9994
 
8.3%
a9994
 
8.3%
s9994
 
8.3%
Common
ValueCountFrequency (%)
9994
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII129922
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t29982
23.1%
e19988
15.4%
U9994
 
7.7%
n9994
 
7.7%
i9994
 
7.7%
d9994
 
7.7%
9994
 
7.7%
S9994
 
7.7%
a9994
 
7.7%
s9994
 
7.7%

City
Text

Distinct531
Distinct (%)5.3%
Missing0
Missing (%)0.0%
Memory size78.2 KiB

Length

Max length17
Median length14
Mean length9.3306984
Min length4

Characters and Unicode

Total characters93,251
Distinct characters51
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique70 ?
Unique (%)0.7%

Sample

1st rowHenderson
2nd rowHenderson
3rd rowLos Angeles
4th rowFort Lauderdale
5th rowFort Lauderdale
ValueCountFrequency (%)
city994
 
7.0%
new937
 
6.6%
york920
 
6.5%
san805
 
5.7%
los747
 
5.2%
angeles747
 
5.2%
philadelphia537
 
3.8%
francisco510
 
3.6%
seattle428
 
3.0%
houston377
 
2.6%
Other values (555)7234
50.8%

Most occurring characters

ValueCountFrequency (%)
e8719
 
9.4%
a7591
 
8.1%
o7499
 
8.0%
i6229
 
6.7%
n6199
 
6.6%
l5986
 
6.4%
s4699
 
5.0%
r4468
 
4.8%
t4438
 
4.8%
4242
 
4.5%
Other values (41)33181
35.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter74773
80.2%
Uppercase Letter14236
 
15.3%
Space Separator4242
 
4.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e8719
11.7%
a7591
10.2%
o7499
10.0%
i6229
 
8.3%
n6199
 
8.3%
l5986
 
8.0%
s4699
 
6.3%
r4468
 
6.0%
t4438
 
5.9%
c2393
 
3.2%
Other values (16)16552
22.1%
Uppercase Letter
ValueCountFrequency (%)
C2085
14.6%
S1740
12.2%
L1295
9.1%
A1242
8.7%
N1134
8.0%
P1013
 
7.1%
Y940
 
6.6%
F794
 
5.6%
D627
 
4.4%
H617
 
4.3%
Other values (14)2749
19.3%
Space Separator
ValueCountFrequency (%)
4242
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin89009
95.5%
Common4242
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e8719
 
9.8%
a7591
 
8.5%
o7499
 
8.4%
i6229
 
7.0%
n6199
 
7.0%
l5986
 
6.7%
s4699
 
5.3%
r4468
 
5.0%
t4438
 
5.0%
c2393
 
2.7%
Other values (40)30788
34.6%
Common
ValueCountFrequency (%)
4242
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII93251
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e8719
 
9.4%
a7591
 
8.1%
o7499
 
8.0%
i6229
 
6.7%
n6199
 
6.6%
l5986
 
6.4%
s4699
 
5.0%
r4468
 
4.8%
t4438
 
4.8%
4242
 
4.5%
Other values (41)33181
35.6%

State
Categorical

High correlation 

Distinct49
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
California
2001 
New York
1128 
Texas
985 
Pennsylvania
587 
Washington
506 
Other values (44)
4787 

Length

Max length20
Median length14
Mean length8.4871923
Min length4

Characters and Unicode

Total characters84,821
Distinct characters46
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowKentucky
2nd rowKentucky
3rd rowCalifornia
4th rowFlorida
5th rowFlorida

Common Values

Value              Count  Frequency (%)
California         2001   20.0%
New York           1128   11.3%
Texas              985    9.9%
Pennsylvania       587    5.9%
Washington         506    5.1%
Illinois           492    4.9%
Ohio               469    4.7%
Florida            383    3.8%
Michigan           255    2.6%
North Carolina     249    2.5%
Other values (39)  2939   29.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
california2001
17.1%
new1322
 
11.3%
york1128
 
9.6%
texas985
 
8.4%
pennsylvania587
 
5.0%
washington506
 
4.3%
illinois492
 
4.2%
ohio469
 
4.0%
florida383
 
3.3%
carolina291
 
2.5%
Other values (43)3542
30.3%

Most occurring characters

ValueCountFrequency (%)
a10758
12.7%
i9895
11.7%
n8090
 
9.5%
o7323
 
8.6%
r5544
 
6.5%
e5051
 
6.0%
l4822
 
5.7%
s4604
 
5.4%
C2566
 
3.0%
f2011
 
2.4%
Other values (36)24157
28.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter71413
84.2%
Uppercase Letter11696
 
13.8%
Space Separator1712
 
2.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a10758
15.1%
i9895
13.9%
n8090
11.3%
o7323
10.3%
r5544
7.8%
e5051
7.1%
l4822
6.8%
s4604
6.4%
f2011
 
2.8%
h1898
 
2.7%
Other values (14)11417
16.0%
Uppercase Letter
ValueCountFrequency (%)
C2566
21.9%
N1655
14.2%
T1168
10.0%
Y1128
9.6%
M763
 
6.5%
I748
 
6.4%
O659
 
5.6%
W621
 
5.3%
P587
 
5.0%
F383
 
3.3%
Other values (11)1418
12.1%
Space Separator
ValueCountFrequency (%)
1712
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin83109
98.0%
Common1712
 
2.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a10758
12.9%
i9895
11.9%
n8090
 
9.7%
o7323
 
8.8%
r5544
 
6.7%
e5051
 
6.1%
l4822
 
5.8%
s4604
 
5.5%
C2566
 
3.1%
f2011
 
2.4%
Other values (35)22445
27.0%
Common
ValueCountFrequency (%)
1712
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII84821
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a10758
12.7%
i9895
11.7%
n8090
 
9.5%
o7323
 
8.6%
r5544
 
6.5%
e5051
 
6.0%
l4822
 
5.7%
s4604
 
5.4%
C2566
 
3.0%
f2011
 
2.4%
Other values (36)24157
28.5%

Postal Code
Real number (ℝ)

High correlation 

Distinct: 631
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55190.379
Minimum: 1040
Maximum: 99301
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1040
5-th percentile: 10009
Q1: 23223
median: 56430.5
Q3: 90008
95-th percentile: 98006
Maximum: 99301
Range: 98261
Interquartile range (IQR): 66785

Descriptive statistics

Standard deviation: 32063.693
Coefficient of variation (CV): 0.58096526
Kurtosis: -1.4930202
Mean: 55190.379
Median Absolute Deviation (MAD): 33573.5
Skewness: -0.12852552
Sum: 5.5157265 × 10^8
Variance: 1.0280804 × 10^9
Monotonicity: Not monotonic
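
Postal Code was profiled as a real number, so four-digit values such as the minimum 1040 have lost a leading zero, and statistics such as the mean or sum carry no real meaning for what is effectively an identifier. A minimal sketch of restoring the text form, assuming five-digit US ZIP codes (Country is constant at "United States"). The helper name `to_zip` is hypothetical.

```python
def to_zip(code):
    """Render a numeric postal code as a five-character string."""
    # zfill pads with leading zeros on the left up to the given width.
    return str(int(code)).zfill(5)

print(to_zip(1040), to_zip(99301))
```

Converting the column to strings in this way before profiling would make the profiler treat it as categorical or text rather than numeric.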
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
10035               263    2.6%
10024               230    2.3%
10009               229    2.3%
94122               203    2.0%
10011               193    1.9%
94110               166    1.7%
98105               165    1.7%
19134               160    1.6%
98103               151    1.5%
90049               151    1.5%
Other values (621)  8083   80.9%

Value  Count  Frequency (%)
1040   1      < 0.1%
1453   6      0.1%
1752   2      < 0.1%
1810   4      < 0.1%
1841   33     0.3%
1852   16     0.2%
1915   3      < 0.1%
2038   17     0.2%
2138   6      0.1%
2148   3      < 0.1%

Value  Count  Frequency (%)
99301  6      0.1%
99207  7      0.1%
98661  5      0.1%
98632  3      < 0.1%
98502  5      0.1%
98270  2      < 0.1%
98226  3      < 0.1%
98208  1      < 0.1%
98198  7      0.1%
98115  112    1.1%

Region
Categorical

High correlation 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
West
3203 
East
2848 
Central
2323 
South
1620 

Length

Max length7
Median length4
Mean length4.8594156
Min length4

Characters and Unicode

Total characters48,565
Distinct characters14
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSouth
2nd rowSouth
3rd rowWest
4th rowSouth
5th rowSouth

Common Values

Value    Count  Frequency (%)
West     3203   32.0%
East     2848   28.5%
Central  2323   23.2%
South    1620   16.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
west3203
32.0%
east2848
28.5%
central2323
23.2%
south1620
16.2%

Most occurring characters

ValueCountFrequency (%)
t9994
20.6%
s6051
12.5%
e5526
11.4%
a5171
10.6%
W3203
 
6.6%
E2848
 
5.9%
C2323
 
4.8%
n2323
 
4.8%
r2323
 
4.8%
l2323
 
4.8%
Other values (4)6480
13.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter38571
79.4%
Uppercase Letter9994
 
20.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t9994
25.9%
s6051
15.7%
e5526
14.3%
a5171
13.4%
n2323
 
6.0%
r2323
 
6.0%
l2323
 
6.0%
o1620
 
4.2%
u1620
 
4.2%
h1620
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
W3203
32.0%
E2848
28.5%
C2323
23.2%
S1620
16.2%

Most occurring scripts

ValueCountFrequency (%)
Latin48565
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t9994
20.6%
s6051
12.5%
e5526
11.4%
a5171
10.6%
W3203
 
6.6%
E2848
 
5.9%
C2323
 
4.8%
n2323
 
4.8%
r2323
 
4.8%
l2323
 
4.8%
Other values (4)6480
13.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII48565
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t9994
20.6%
s6051
12.5%
e5526
11.4%
a5171
10.6%
W3203
 
6.6%
E2848
 
5.9%
C2323
 
4.8%
n2323
 
4.8%
r2323
 
4.8%
l2323
 
4.8%
Other values (4)6480
13.3%
Distinct1862
Distinct (%)18.6%
Missing0
Missing (%)0.0%
Memory size78.2 KiB

Length

Max length15
Median length15
Mean length15
Min length15

Characters and Unicode

Total characters149,910
Distinct characters27
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique91 ?
Unique (%)0.9%

Sample

1st rowFUR-BO-10001798
2nd rowFUR-CH-10000454
3rd rowOFF-LA-10000240
4th rowFUR-TA-10000577
5th rowOFF-ST-10000760
ValueCountFrequency (%)
off-pa-1000197019
 
0.2%
tec-ac-1000383218
 
0.2%
fur-fu-1000427016
 
0.2%
fur-ch-1000114615
 
0.2%
tec-ac-1000362815
 
0.2%
fur-ch-1000264715
 
0.2%
tec-ac-1000204915
 
0.2%
off-pa-1000237714
 
0.1%
fur-ch-1000377414
 
0.1%
off-bi-1000202614
 
0.1%
Other values (1852)9839
98.4%

Most occurring characters

ValueCountFrequency (%)
035052
23.4%
-19988
13.3%
F15347
10.2%
114995
10.0%
O6322
 
4.2%
24862
 
3.2%
44831
 
3.2%
34805
 
3.2%
A4422
 
2.9%
53401
 
2.3%
Other values (17)35885
23.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number79952
53.3%
Uppercase Letter49970
33.3%
Dash Punctuation19988
 
13.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F15347
30.7%
O6322
12.7%
A4422
 
8.8%
C3307
 
6.6%
U3268
 
6.5%
T3012
 
6.0%
R2917
 
5.8%
P2725
 
5.5%
E2101
 
4.2%
B1751
 
3.5%
Other values (6)4798
 
9.6%
Decimal Number
ValueCountFrequency (%)
035052
43.8%
114995
18.8%
24862
 
6.1%
44831
 
6.0%
34805
 
6.0%
53401
 
4.3%
73103
 
3.9%
93051
 
3.8%
62999
 
3.8%
82853
 
3.6%
Dash Punctuation
ValueCountFrequency (%)
-19988
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common99940
66.7%
Latin49970
33.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
F15347
30.7%
O6322
12.7%
A4422
 
8.8%
C3307
 
6.6%
U3268
 
6.5%
T3012
 
6.0%
R2917
 
5.8%
P2725
 
5.5%
E2101
 
4.2%
B1751
 
3.5%
Other values (6)4798
 
9.6%
Common
ValueCountFrequency (%)
035052
35.1%
-19988
20.0%
114995
15.0%
24862
 
4.9%
44831
 
4.8%
34805
 
4.8%
53401
 
3.4%
73103
 
3.1%
93051
 
3.1%
62999
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII149910
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
035052
23.4%
-19988
13.3%
F15347
10.2%
114995
10.0%
O6322
 
4.2%
24862
 
3.2%
44831
 
3.2%
34805
 
3.2%
A4422
 
2.9%
53401
 
2.3%
Other values (17)35885
23.9%

Category
Categorical

High correlation 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Office Supplies
6026 
Furniture
2121 
Technology
1847 

Length

Max length15
Median length15
Mean length12.802582
Min length9

Characters and Unicode

Total characters127,949
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFurniture
2nd rowFurniture
3rd rowOffice Supplies
4th rowFurniture
5th rowOffice Supplies

Common Values

Value            Count  Frequency (%)
Office Supplies  6026   60.3%
Furniture        2121   21.2%
Technology       1847   18.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
office6026
37.6%
supplies6026
37.6%
furniture2121
 
13.2%
technology1847
 
11.5%

Most occurring characters

ValueCountFrequency (%)
e16020
12.5%
i14173
11.1%
p12052
9.4%
f12052
9.4%
u10268
 
8.0%
c7873
 
6.2%
l7873
 
6.2%
O6026
 
4.7%
s6026
 
4.7%
S6026
 
4.7%
Other values (10)29560
23.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter105903
82.8%
Uppercase Letter16020
 
12.5%
Space Separator6026
 
4.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e16020
15.1%
i14173
13.4%
p12052
11.4%
f12052
11.4%
u10268
9.7%
c7873
7.4%
l7873
7.4%
s6026
 
5.7%
r4242
 
4.0%
n3968
 
3.7%
Other values (5)11356
10.7%
Uppercase Letter
ValueCountFrequency (%)
O6026
37.6%
S6026
37.6%
F2121
 
13.2%
T1847
 
11.5%
Space Separator
ValueCountFrequency (%)
6026
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin121923
95.3%
Common6026
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e16020
13.1%
i14173
11.6%
p12052
9.9%
f12052
9.9%
u10268
8.4%
c7873
 
6.5%
l7873
 
6.5%
O6026
 
4.9%
s6026
 
4.9%
S6026
 
4.9%
Other values (9)23534
19.3%
Common
ValueCountFrequency (%)
6026
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII127949
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e16020
12.5%
i14173
11.1%
p12052
9.4%
f12052
9.4%
u10268
 
8.0%
c7873
 
6.2%
l7873
 
6.2%
O6026
 
4.7%
s6026
 
4.7%
S6026
 
4.7%
Other values (10)29560
23.1%

Sub-Category
Categorical

High correlation 

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Binders: 1523
Paper: 1370
Furnishings: 957
Phones: 889
Storage: 846
Other values (12): 4409

Length

Max length: 11
Median length: 9
Mean length: 7.191715
Min length: 3

Characters and Unicode

Total characters: 71,874
Distinct characters: 28
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Bookcases
2nd row: Chairs
3rd row: Labels
4th row: Tables
5th row: Storage

Common Values

Value | Count | Frequency (%)
Binders | 1523 | 15.2%
Paper | 1370 | 13.7%
Furnishings | 957 | 9.6%
Phones | 889 | 8.9%
Storage | 846 | 8.5%
Art | 796 | 8.0%
Accessories | 775 | 7.8%
Chairs | 617 | 6.2%
Appliances | 466 | 4.7%
Labels | 364 | 3.6%
Other values (7) | 1391 | 13.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
binders | 1523 | 15.2%
paper | 1370 | 13.7%
furnishings | 957 | 9.6%
phones | 889 | 8.9%
storage | 846 | 8.5%
art | 796 | 8.0%
accessories | 775 | 7.8%
chairs | 617 | 6.2%
appliances | 466 | 4.7%
labels | 364 | 3.6%
Other values (7) | 1391 | 13.9%

Most occurring characters

Value | Count | Frequency (%)
s | 9934 | 13.8%
e | 8870 | 12.3%
r | 7169 | 10.0%
i | 5668 | 7.9%
n | 5378 | 7.5%
a | 4542 | 6.3%
o | 3288 | 4.6%
p | 3004 | 4.2%
h | 2578 | 3.6%
c | 2359 | 3.3%
Other values (18) | 19084 | 26.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 61880 | 86.1%
Uppercase Letter | 9994 | 13.9%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
s | 9934 | 16.1%
e | 8870 | 14.3%
r | 7169 | 11.6%
i | 5668 | 9.2%
n | 5378 | 8.7%
a | 4542 | 7.3%
o | 3288 | 5.3%
p | 3004 | 4.9%
h | 2578 | 4.2%
c | 2359 | 3.8%
Other values (8) | 9090 | 14.7%
Uppercase Letter
Value | Count | Frequency (%)
P | 2259 | 22.6%
A | 2037 | 20.4%
B | 1751 | 17.5%
F | 1174 | 11.7%
S | 1036 | 10.4%
C | 685 | 6.9%
L | 364 | 3.6%
T | 319 | 3.2%
E | 254 | 2.5%
M | 115 | 1.2%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 71874 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
s | 9934 | 13.8%
e | 8870 | 12.3%
r | 7169 | 10.0%
i | 5668 | 7.9%
n | 5378 | 7.5%
a | 4542 | 6.3%
o | 3288 | 4.6%
p | 3004 | 4.2%
h | 2578 | 3.6%
c | 2359 | 3.3%
Other values (18) | 19084 | 26.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 71874 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
s | 9934 | 13.8%
e | 8870 | 12.3%
r | 7169 | 10.0%
i | 5668 | 7.9%
n | 5378 | 7.5%
a | 4542 | 6.3%
o | 3288 | 4.6%
p | 3004 | 4.2%
h | 2578 | 3.6%
c | 2359 | 3.3%
Other values (18) | 19084 | 26.6%

Product Name
Text

Distinct: 1850
Distinct (%): 18.5%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Length

Max length: 127
Median length: 78
Mean length: 36.91605
Min length: 5

Characters and Unicode

Total characters: 368,939
Distinct characters: 85
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 91
Unique (%): 0.9%

Sample

1st row: Bush Somerset Collection Bookcase
2nd row: Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back
3rd row: Self-Adhesive Address Labels for Typewriters by Universal
4th row: Bretford CR4500 Series Slim Rectangular Table
5th row: Eldon Fold 'N Roll Cart System

Value | Count | Frequency (%)
xerox | 865 | 1.5%
x | 701 | 1.3%
(blank) | 599 | 1.1%
with | 599 | 1.1%
avery | 557 | 1.0%
for | 539 | 1.0%
binders | 524 | 0.9%
chair | 479 | 0.9%
black | 426 | 0.8%
phone | 374 | 0.7%
Other values (2798) | 50371 | 89.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 45670 | 12.4%
e | 33538 | 9.1%
r | 20791 | 5.6%
o | 19902 | 5.4%
a | 19064 | 5.2%
i | 18648 | 5.1%
l | 16365 | 4.4%
n | 15622 | 4.2%
s | 14683 | 4.0%
t | 14550 | 3.9%
Other values (75) | 150106 | 40.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 238253 | 64.6%
Uppercase Letter | 56270 | 15.3%
Space Separator | 46097 | 12.5%
Decimal Number | 17981 | 4.9%
Other Punctuation | 7152 | 1.9%
Dash Punctuation | 2940 | 0.8%
Control | 86 | < 0.1%
Close Punctuation | 60 | < 0.1%
Open Punctuation | 60 | < 0.1%
Math Symbol | 35 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 33538 | 14.1%
r | 20791 | 8.7%
o | 19902 | 8.4%
a | 19064 | 8.0%
i | 18648 | 7.8%
l | 16365 | 6.9%
n | 15622 | 6.6%
s | 14683 | 6.2%
t | 14550 | 6.1%
c | 8924 | 3.7%
Other values (18) | 56166 | 23.6%
Uppercase Letter
Value | Count | Frequency (%)
S | 6281 | 11.2%
C | 6007 | 10.7%
B | 5530 | 9.8%
P | 4918 | 8.7%
A | 2948 | 5.2%
D | 2941 | 5.2%
M | 2870 | 5.1%
T | 2616 | 4.6%
F | 2510 | 4.5%
L | 2284 | 4.1%
Other values (16) | 17365 | 30.9%
Other Punctuation
Value | Count | Frequency (%)
, | 3120 | 43.6%
/ | 1561 | 21.8%
" | 1300 | 18.2%
. | 463 | 6.5%
& | 287 | 4.0%
' | 257 | 3.6%
# | 90 | 1.3%
% | 45 | 0.6%
* | 9 | 0.1%
! | 9 | 0.1%
Other values (2) | 11 | 0.2%
Decimal Number
Value | Count | Frequency (%)
1 | 3783 | 21.0%
0 | 2921 | 16.2%
2 | 2270 | 12.6%
4 | 1725 | 9.6%
3 | 1530 | 8.5%
5 | 1443 | 8.0%
8 | 1254 | 7.0%
9 | 1234 | 6.9%
6 | 941 | 5.2%
7 | 880 | 4.9%
Space Separator
Value | Count | Frequency (%)
(space) | 45670 | 99.1%
(non-ASCII space) | 427 | 0.9%
Control
Value | Count | Frequency (%)
” | 67 | 77.9%
“ | 19 | 22.1%
Dash Punctuation
Value | Count | Frequency (%)
- | 2940 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 60 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 60 | 100.0%
Math Symbol
Value | Count | Frequency (%)
+ | 35 | 100.0%
Other Number
Value | Count | Frequency (%)
¾ | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 294523 | 79.8%
Common | 74416 | 20.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 33538 | 11.4%
r | 20791 | 7.1%
o | 19902 | 6.8%
a | 19064 | 6.5%
i | 18648 | 6.3%
l | 16365 | 5.6%
n | 15622 | 5.3%
s | 14683 | 5.0%
t | 14550 | 4.9%
c | 8924 | 3.0%
Other values (44) | 112436 | 38.2%
Common
Value | Count | Frequency (%)
(space) | 45670 | 61.4%
1 | 3783 | 5.1%
, | 3120 | 4.2%
- | 2940 | 4.0%
0 | 2921 | 3.9%
2 | 2270 | 3.1%
4 | 1725 | 2.3%
/ | 1561 | 2.1%
3 | 1530 | 2.1%
5 | 1443 | 1.9%
Other values (21) | 7453 | 10.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 368404 | 99.9%
None | 535 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 45670 | 12.4%
e | 33538 | 9.1%
r | 20791 | 5.6%
o | 19902 | 5.4%
a | 19064 | 5.2%
i | 18648 | 5.1%
l | 16365 | 4.4%
n | 15622 | 4.2%
s | 14683 | 4.0%
t | 14550 | 3.9%
Other values (69) | 149571 | 40.6%
None
Value | Count | Frequency (%)
(non-ASCII space) | 427 | 79.8%
” | 67 | 12.5%
“ | 19 | 3.6%
é | 14 | 2.6%
¾ | 5 | 0.9%
à | 3 | 0.6%

Sales
Real number (ℝ)

High correlation 

Distinct: 5825
Distinct (%): 58.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 229.858
Minimum: 0.444
Maximum: 22638.48
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0.444
5-th percentile: 4.98
Q1: 17.28
median: 54.49
Q3: 209.94
95-th percentile: 956.98425
Maximum: 22638.48
Range: 22638.036
Interquartile range (IQR): 192.66

Descriptive statistics

Standard deviation: 623.2451
Coefficient of variation (CV): 2.7114353
Kurtosis: 305.31175
Mean: 229.858
Median Absolute Deviation (MAD): 45.406
Skewness: 12.972752
Sum: 2297200.9
Variance: 388434.46
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
12.96 | 56 | 0.6%
19.44 | 39 | 0.4%
15.552 | 39 | 0.4%
25.92 | 36 | 0.4%
10.368 | 36 | 0.4%
32.4 | 28 | 0.3%
17.94 | 21 | 0.2%
6.48 | 21 | 0.2%
20.736 | 19 | 0.2%
14.94 | 17 | 0.2%
Other values (5815) | 9682 | 96.9%

Minimum 10 values

Value | Count | Frequency (%)
0.444 | 1 | < 0.1%
0.556 | 1 | < 0.1%
0.836 | 1 | < 0.1%
0.852 | 1 | < 0.1%
0.876 | 1 | < 0.1%
0.898 | 1 | < 0.1%
0.984 | 1 | < 0.1%
0.99 | 1 | < 0.1%
1.044 | 1 | < 0.1%
1.08 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
22638.48 | 1 | < 0.1%
17499.95 | 1 | < 0.1%
13999.96 | 1 | < 0.1%
11199.968 | 1 | < 0.1%
10499.97 | 1 | < 0.1%
9892.74 | 1 | < 0.1%
9449.95 | 1 | < 0.1%
9099.93 | 1 | < 0.1%
8749.95 | 1 | < 0.1%
8399.976 | 1 | < 0.1%
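
Several of the Sales figures are derived from others in the same summary. As a rough consistency check, the sketch below recomputes the range, IQR, coefficient of variation, and variance from the values reported for Sales. The variable names are illustrative, and the inputs are the report's already-rounded figures, so the last digits may differ slightly.

```python
# Figures copied from the Sales summary in this report (already rounded).
minimum, maximum = 0.444, 22638.48
q1, q3 = 17.28, 209.94
mean, std = 229.858, 623.2451

value_range = maximum - minimum   # reported Range: 22638.036
iqr = q3 - q1                     # reported IQR: 192.66
cv = std / mean                   # reported CV: 2.7114353
variance = std ** 2               # reported Variance: 388434.46

print(f"range={value_range:.3f} iqr={iqr:.2f} cv={cv:.5f} variance={variance:.2f}")
```

A CV near 2.7, together with a median (54.49) far below the mean (229.858), is consistent with the strong right skew reported for this variable.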

Quantity
Real number (ℝ)

Distinct: 14
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7895737
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 5
95-th percentile: 8
Maximum: 14
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.2251097
Coefficient of variation (CV): 0.58716622
Kurtosis: 1.9918894
Mean: 3.7895737
Median Absolute Deviation (MAD): 1
Skewness: 1.2785448
Sum: 37873
Variance: 4.9511131
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
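
Common values

Value | Count | Frequency (%)
3 | 2409 | 24.1%
2 | 2402 | 24.0%
5 | 1230 | 12.3%
4 | 1191 | 11.9%
1 | 899 | 9.0%
7 | 606 | 6.1%
6 | 572 | 5.7%
9 | 258 | 2.6%
8 | 257 | 2.6%
10 | 57 | 0.6%
Other values (4) | 113 | 1.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 899 | 9.0%
2 | 2402 | 24.0%
3 | 2409 | 24.1%
4 | 1191 | 11.9%
5 | 1230 | 12.3%
6 | 572 | 5.7%
7 | 606 | 6.1%
8 | 257 | 2.6%
9 | 258 | 2.6%
10 | 57 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
14 | 29 | 0.3%
13 | 27 | 0.3%
12 | 23 | 0.2%
11 | 34 | 0.3%
10 | 57 | 0.6%
9 | 258 | 2.6%
8 | 257 | 2.6%
7 | 606 | 6.1%
6 | 572 | 5.7%
5 | 1230 | 12.3%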
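
Because Quantity takes only 14 distinct values, its quantiles can be read off the frequency tables directly. The sketch below uses a simple cumulative-count definition (the smallest value whose running count reaches q times n). This is an assumption for illustration: the profiling tool may interpolate between neighbouring observations, although for this column both approaches land on the same values. The helper name `quantile_from_counts` is hypothetical.

```python
# Value -> count pairs copied from the Quantity frequency tables in this report.
counts = {1: 899, 2: 2402, 3: 2409, 4: 1191, 5: 1230, 6: 572, 7: 606,
          8: 257, 9: 258, 10: 57, 11: 34, 12: 23, 13: 27, 14: 29}


def quantile_from_counts(counts, q):
    """Return the smallest value whose cumulative count reaches q * n."""
    n = sum(counts.values())
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= q * n:
            return value
    return max(counts)


n = sum(counts.values())  # number of observations covered by the table
quartiles = [quantile_from_counts(counts, q) for q in (0.25, 0.5, 0.75)]
print(n, quartiles)  # reported Q1, median, Q3: 2, 3, 5
```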

Discount
Real number (ℝ)

High correlation  Zeros 

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.15620272
Minimum: 0
Maximum: 0.8
Zeros: 4798
Zeros (%): 48.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.2
Q3: 0.2
95-th percentile: 0.7
Maximum: 0.8
Range: 0.8
Interquartile range (IQR): 0.2

Descriptive statistics

Standard deviation: 0.20645197
Coefficient of variation (CV): 1.3216925
Kurtosis: 2.4095461
Mean: 0.15620272
Median Absolute Deviation (MAD): 0.2
Skewness: 1.6842947
Sum: 1561.09
Variance: 0.042622415
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Common values

Value | Count | Frequency (%)
0 | 4798 | 48.0%
0.2 | 3657 | 36.6%
0.7 | 418 | 4.2%
0.8 | 300 | 3.0%
0.3 | 227 | 2.3%
0.4 | 206 | 2.1%
0.6 | 138 | 1.4%
0.1 | 94 | 0.9%
0.5 | 66 | 0.7%
0.15 | 52 | 0.5%
Other values (2) | 38 | 0.4%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4798 | 48.0%
0.1 | 94 | 0.9%
0.15 | 52 | 0.5%
0.2 | 3657 | 36.6%
0.3 | 227 | 2.3%
0.32 | 27 | 0.3%
0.4 | 206 | 2.1%
0.45 | 11 | 0.1%
0.5 | 66 | 0.7%
0.6 | 138 | 1.4%

Maximum 10 values

Value | Count | Frequency (%)
0.8 | 300 | 3.0%
0.7 | 418 | 4.2%
0.6 | 138 | 1.4%
0.5 | 66 | 0.7%
0.45 | 11 | 0.1%
0.4 | 206 | 2.1%
0.32 | 27 | 0.3%
0.3 | 227 | 2.3%
0.2 | 3657 | 36.6%
0.15 | 52 | 0.5%
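
Only twelve discount levels occur, so the headline Discount statistics can be rebuilt from the frequency tables alone. The sketch below takes the level/count pairs from those tables and recomputes the sum, the mean, and the share of zero discounts. Variable names are illustrative.

```python
# Discount level -> count, copied from the frequency tables in this report.
counts = {0.0: 4798, 0.1: 94, 0.15: 52, 0.2: 3657, 0.3: 227, 0.32: 27,
          0.4: 206, 0.45: 11, 0.5: 66, 0.6: 138, 0.7: 418, 0.8: 300}

n = sum(counts.values())                               # number of rows
total = sum(level * c for level, c in counts.items())  # reported Sum: 1561.09
mean = total / n                                       # reported Mean: 0.15620272
zero_share = counts[0.0] / n                           # reported Zeros (%): 48.0%

print(f"n={n} sum={total:.2f} mean={mean:.8f} zeros={zero_share:.1%}")
```

Agreement with the reported sum and mean also supports reading the two near-identical rows as 0.3 (227 rows) and 0.32 (27 rows).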

Profit
Real number (ℝ)

High correlation 

Distinct: 7287
Distinct (%): 72.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.656896
Minimum: -6599.978
Maximum: 8399.976
Zeros: 65
Zeros (%): 0.7%
Negative: 1871
Negative (%): 18.7%
Memory size: 78.2 KiB

Quantile statistics

Minimum: -6599.978
5-th percentile: -53.03092
Q1: 1.72875
median: 8.6665
Q3: 29.364
95-th percentile: 168.4704
Maximum: 8399.976
Range: 14999.954
Interquartile range (IQR): 27.63525

Descriptive statistics

Standard deviation: 234.26011
Coefficient of variation (CV): 8.1746504
Kurtosis: 397.18851
Mean: 28.656896
Median Absolute Deviation (MAD): 10.77855
Skewness: 7.5614316
Sum: 286397.02
Variance: 54877.798
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 65 | 0.7%
6.2208 | 43 | 0.4%
9.3312 | 38 | 0.4%
5.4432 | 32 | 0.3%
3.6288 | 32 | 0.3%
15.552 | 26 | 0.3%
12.4416 | 21 | 0.2%
7.2576 | 19 | 0.2%
3.1104 | 18 | 0.2%
9.072 | 11 | 0.1%
Other values (7277) | 9689 | 96.9%

Minimum 10 values

Value | Count | Frequency (%)
-6599.978 | 1 | < 0.1%
-3839.9904 | 1 | < 0.1%
-3701.8928 | 1 | < 0.1%
-3399.98 | 1 | < 0.1%
-2929.4845 | 1 | < 0.1%
-2639.9912 | 1 | < 0.1%
-2287.782 | 1 | < 0.1%
-1862.3124 | 1 | < 0.1%
-1850.9464 | 1 | < 0.1%
-1811.0784 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
8399.976 | 1 | < 0.1%
6719.9808 | 1 | < 0.1%
5039.9856 | 1 | < 0.1%
4946.37 | 1 | < 0.1%
4630.4755 | 1 | < 0.1%
3919.9888 | 1 | < 0.1%
3177.475 | 1 | < 0.1%
2799.984 | 1 | < 0.1%
2591.9568 | 1 | < 0.1%
2504.2216 | 1 | < 0.1%

Interactions

[36 interaction plots]

Correlations

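Variable | Category | Discount | Postal Code | Profit | Quantity | Region | Row ID | Sales | Segment | Ship Mode | State | Sub-Category
Category | 1.000 | 0.377 | 0.000 | 0.056 | 0.000 | 0.000 | 0.008 | 0.072 | 0.000 | 0.000 | 0.019 | 0.999
Discount | 0.377 | 1.000 | 0.053 | -0.543 | -0.001 | 0.294 | 0.013 | -0.057 | 0.005 | 0.027 | 0.354 | 0.353
Postal Code | 0.000 | 0.053 | 1.000 | -0.005 | 0.014 | 0.921 | 0.011 | -0.002 | 0.035 | 0.038 | 0.968 | 0.000
Profit | 0.056 | -0.543 | -0.005 | 1.000 | 0.234 | 0.021 | -0.011 | 0.518 | 0.000 | 0.005 | 0.017 | 0.130
Quantity | 0.000 | -0.001 | 0.014 | 0.234 | 1.000 | 0.000 | -0.002 | 0.327 | 0.012 | 0.000 | 0.004 | 0.000
Region | 0.000 | 0.294 | 0.921 | 0.021 | 0.000 | 1.000 | 0.038 | 0.000 | 0.000 | 0.022 | 0.998 | 0.000
Row ID | 0.008 | 0.013 | 0.011 | -0.011 | -0.002 | 0.038 | 1.000 | -0.001 | 0.030 | 0.050 | 0.102 | 0.000
Sales | 0.072 | -0.057 | -0.002 | 0.518 | 0.327 | 0.000 | -0.001 | 1.000 | 0.002 | 0.000 | 0.000 | 0.142
Segment | 0.000 | 0.005 | 0.035 | 0.000 | 0.012 | 0.000 | 0.030 | 0.002 | 1.000 | 0.033 | 0.090 | 0.000
Ship Mode | 0.000 | 0.027 | 0.038 | 0.005 | 0.000 | 0.022 | 0.050 | 0.000 | 0.033 | 1.000 | 0.096 | 0.007
State | 0.019 | 0.354 | 0.968 | 0.017 | 0.004 | 0.998 | 0.102 | 0.000 | 0.090 | 0.096 | 1.000 | 0.000
Sub-Category | 0.999 | 0.353 | 0.000 | 0.130 | 0.000 | 0.000 | 0.000 | 0.142 | 0.000 | 0.007 | 0.000 | 1.000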
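
The matrix is easier to scan when reduced to its strongest pairs. The sketch below lists the upper-triangle entries whose absolute value is at least 0.2 and flags those at or above 0.5. The 0.5 cut-off is an assumption chosen for illustration, not a setting read from this report, although the pairs it flags coincide with the "High correlation" alerts listed in the overview.

```python
# Upper-triangle entries copied from the correlation matrix in this report
# (only pairs with an absolute coefficient of at least 0.2 are listed).
correlations = {
    ("Category", "Sub-Category"): 0.999,
    ("Category", "Discount"): 0.377,
    ("Discount", "Profit"): -0.543,
    ("Discount", "Region"): 0.294,
    ("Discount", "State"): 0.354,
    ("Discount", "Sub-Category"): 0.353,
    ("Postal Code", "Region"): 0.921,
    ("Postal Code", "State"): 0.968,
    ("Profit", "Quantity"): 0.234,
    ("Profit", "Sales"): 0.518,
    ("Quantity", "Sales"): 0.327,
    ("Region", "State"): 0.998,
}

THRESHOLD = 0.5  # assumed cut-off for calling a pair "highly correlated"

# Keep pairs whose absolute coefficient meets the threshold, sorted by name.
high = sorted(pair for pair, r in correlations.items() if abs(r) >= THRESHOLD)
for a, b in high:
    print(f"{a} <-> {b}: {correlations[(a, b)]:+.3f}")
```

The sign matters for interpretation: Discount and Profit are the only flagged pair that moves in opposite directions (-0.543).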

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | Country | City | State | Postal Code | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit
0 | 1 | CA-2016-152156 | 2016-11-08 | 2016-11-11 | Second Class | CG-12520 | Claire Gute | Consumer | United States | Henderson | Kentucky | 42420 | South | FUR-BO-10001798 | Furniture | Bookcases | Bush Somerset Collection Bookcase | 261.9600 | 2 | 0.00 | 41.9136
1 | 2 | CA-2016-152156 | 2016-11-08 | 2016-11-11 | Second Class | CG-12520 | Claire Gute | Consumer | United States | Henderson | Kentucky | 42420 | South | FUR-CH-10000454 | Furniture | Chairs | Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back | 731.9400 | 3 | 0.00 | 219.5820
2 | 3 | CA-2016-138688 | 2016-06-12 | 2016-06-16 | Second Class | DV-13045 | Darrin Van Huff | Corporate | United States | Los Angeles | California | 90036 | West | OFF-LA-10000240 | Office Supplies | Labels | Self-Adhesive Address Labels for Typewriters by Universal | 14.6200 | 2 | 0.00 | 6.8714
3 | 4 | US-2015-108966 | 2015-10-11 | 2015-10-18 | Standard Class | SO-20335 | Sean O'Donnell | Consumer | United States | Fort Lauderdale | Florida | 33311 | South | FUR-TA-10000577 | Furniture | Tables | Bretford CR4500 Series Slim Rectangular Table | 957.5775 | 5 | 0.45 | -383.0310
4 | 5 | US-2015-108966 | 2015-10-11 | 2015-10-18 | Standard Class | SO-20335 | Sean O'Donnell | Consumer | United States | Fort Lauderdale | Florida | 33311 | South | OFF-ST-10000760 | Office Supplies | Storage | Eldon Fold 'N Roll Cart System | 22.3680 | 2 | 0.20 | 2.5164
5 | 6 | CA-2014-115812 | 2014-06-09 | 2014-06-14 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032 | West | FUR-FU-10001487 | Furniture | Furnishings | Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood | 48.8600 | 7 | 0.00 | 14.1694
6 | 7 | CA-2014-115812 | 2014-06-09 | 2014-06-14 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032 | West | OFF-AR-10002833 | Office Supplies | Art | Newell 322 | 7.2800 | 4 | 0.00 | 1.9656
7 | 8 | CA-2014-115812 | 2014-06-09 | 2014-06-14 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032 | West | TEC-PH-10002275 | Technology | Phones | Mitel 5320 IP Phone VoIP phone | 907.1520 | 6 | 0.20 | 90.7152
8 | 9 | CA-2014-115812 | 2014-06-09 | 2014-06-14 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032 | West | OFF-BI-10003910 | Office Supplies | Binders | DXL Angle-View Binders with Locking Rings by Samsill | 18.5040 | 3 | 0.20 | 5.7825
9 | 10 | CA-2014-115812 | 2014-06-09 | 2014-06-14 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032 | West | OFF-AP-10002892 | Office Supplies | Appliances | Belkin F5C206VTEL 6 Outlet Surge | 114.9000 | 5 | 0.00 | 34.4700
(index) | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | Country | City | State | Postal Code | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit
9984 | 9985 | CA-2015-100251 | 2015-05-17 | 2015-05-23 | Standard Class | DV-13465 | Dianna Vittorini | Consumer | United States | Long Beach | New York | 11561 | East | OFF-LA-10003766 | Office Supplies | Labels | Self-Adhesive Removable Labels | 31.500 | 10 | 0.0 | 15.1200
9985 | 9986 | CA-2015-100251 | 2015-05-17 | 2015-05-23 | Standard Class | DV-13465 | Dianna Vittorini | Consumer | United States | Long Beach | New York | 11561 | East | OFF-SU-10000898 | Office Supplies | Supplies | Acme Hot Forged Carbon Steel Scissors with Nickel-Plated Handles, 3 7/8" Cut, 8"L | 55.600 | 4 | 0.0 | 16.1240
9986 | 9987 | CA-2016-125794 | 2016-09-29 | 2016-10-03 | Standard Class | ML-17410 | Maris LaWare | Consumer | United States | Los Angeles | California | 90008 | West | TEC-AC-10003399 | Technology | Accessories | Memorex Mini Travel Drive 64 GB USB 2.0 Flash Drive | 36.240 | 1 | 0.0 | 15.2208
9987 | 9988 | CA-2017-163629 | 2017-11-17 | 2017-11-21 | Standard Class | RA-19885 | Ruben Ausman | Corporate | United States | Athens | Georgia | 30605 | South | TEC-AC-10001539 | Technology | Accessories | Logitech G430 Surround Sound Gaming Headset with Dolby 7.1 Technology | 79.990 | 1 | 0.0 | 28.7964
9988 | 9989 | CA-2017-163629 | 2017-11-17 | 2017-11-21 | Standard Class | RA-19885 | Ruben Ausman | Corporate | United States | Athens | Georgia | 30605 | South | TEC-PH-10004006 | Technology | Phones | Panasonic KX - TS880B Telephone | 206.100 | 5 | 0.0 | 55.6470
9989 | 9990 | CA-2014-110422 | 2014-01-21 | 2014-01-23 | Second Class | TB-21400 | Tom Boeckenhauer | Consumer | United States | Miami | Florida | 33180 | South | FUR-FU-10001889 | Furniture | Furnishings | Ultra Door Pull Handle | 25.248 | 3 | 0.2 | 4.1028
9990 | 9991 | CA-2017-121258 | 2017-02-26 | 2017-03-03 | Standard Class | DB-13060 | Dave Brooks | Consumer | United States | Costa Mesa | California | 92627 | West | FUR-FU-10000747 | Furniture | Furnishings | Tenex B1-RE Series Chair Mats for Low Pile Carpets | 91.960 | 2 | 0.0 | 15.6332
9991 | 9992 | CA-2017-121258 | 2017-02-26 | 2017-03-03 | Standard Class | DB-13060 | Dave Brooks | Consumer | United States | Costa Mesa | California | 92627 | West | TEC-PH-10003645 | Technology | Phones | Aastra 57i VoIP phone | 258.576 | 2 | 0.2 | 19.3932
9992 | 9993 | CA-2017-121258 | 2017-02-26 | 2017-03-03 | Standard Class | DB-13060 | Dave Brooks | Consumer | United States | Costa Mesa | California | 92627 | West | OFF-PA-10004041 | Office Supplies | Paper | It's Hot Message Books with Stickers, 2 3/4" x 5" | 29.600 | 4 | 0.0 | 13.3200
9993 | 9994 | CA-2017-119914 | 2017-05-04 | 2017-05-09 | Second Class | CC-12220 | Chris Cortes | Consumer | United States | Westminster | California | 92683 | West | OFF-AP-10002684 | Office Supplies | Appliances | Acco 7-Outlet Masterpiece Power Center, Wihtout Fax/Phone Line Protection | 243.160 | 2 | 0.0 | 72.9480